Abstract: Self-adaptation is used in all main paradigms of evolutionary computation to increase efficiency. We claim that the basis of self-adaptation is the use of neutrality. In the absence of external control, neutrality allows a variation of the search distribution without the risk of fitness loss.



I. Introduction 

It is well known that the genotype-phenotype map- 
ping in natural evolution is highly redundant (i.e., the 
mapping is not injective) and that neutral variations 
frequently occur. (A variation is called neutral if it 
alters the genotype but not the phenotype of an in- 
dividual.) The potential positive effects of neutrality 
have extensively been discussed ||, [l^, [l4| and there 
is growing interest in investigating how these results 
from biology carry over to evolutionary algorithms (EAs). However, this conflicts with
the opinion that redundant encodings are inappropriate
for real-world problems — a bijective genotype-phenotype 
mapping is often regarded as a design principle for effi- 
cient EAs (e.g., see @). 

Unlike neutrality, self-adaptation is an established key 
concept in evolutionary computation Q. We will point 
out that neutrality is a necessity for self-adaptation. 
And since there is no doubt about the benefits of self- 
adaptation for a wide range of problems, this is a strong 
argument for the usefulness of neutral encodings. Our 
point of view is that, in the absence of external control, 
neutrality provides a way to vary the search distribution 
without the risk of fitness loss. We propose to term this 
way of adapting the search strategy self-adaptation. This 
approach generalizes standard definitions and underlines 
the central role of neutrality. 

We reconsider two instructive examples from the litera- 
ture: The first example is from molecular biology, where 
it is shown how neutrality can increase the variability 
of certain regions in the genome and conserve the infor- 
mation in other regions. Using our definition, this is self- 
adaptation. The second one deals with self-adaptation in 
evolution strategies, a well-known example of successful 
self-adaptation in evolutionary computation. These algorithms rely on neutrality to adapt the search strategy and are frequently applied to solve real-world optimization tasks.

Our arguments are based on recent efforts to provide a unifying view on the search distribution in evolutionary algorithms and its (self-)adaptation plf. We introduce this formalism in the following section. Section III describes the relation between self-adaptation and neutrality, and in Section IV we present the two examples to illustrate our ideas.

II. Modeling evolutionary exploration 

In this section, we outline that the individuals in the 
population, the genotype-phenotype mapping and the 
variation operators including their parameters can be re- 
garded as a parameterization of the search distribution. 
This is done in the framework of global random search and evolutionary optimization, although all considerations can be transferred to adaptation in natural evolution, see Sec. III-B.



Evolutionary algorithms can be considered as a certain class of global random search algorithms. Let the search problem under consideration be described by a quality function $\Phi : \mathcal{P} \to \mathbb{R}$ to be optimized. The set $\mathcal{P}$ denotes the search space. According to p3|, the general scheme of global random search is given by:^1

1. Choose a probability distribution $P_{\mathcal{P}}^{(t)}$ on $\mathcal{P}$.

2. Obtain points $g_1^{(t)}, \dots, g_\lambda^{(t)}$ by taking $\lambda$ samples from the distribution $P_{\mathcal{P}}^{(t)}$. Evaluate $\Phi$ (perhaps with random noise) at these points.

3. According to a fixed (algorithm dependent) rule construct a probability distribution $P_{\mathcal{P}}^{(t+1)}$ on $\mathcal{P}$.

4. Check for some appropriate stopping condition; if the algorithm has not terminated, substitute $t := t+1$ and return to Step 2.
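The four steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the search space is the real line, the quality function is minimized, the search distribution is a single Gaussian, and the update rule in Step 3 (refitting mean and standard deviation to the better half of the samples) is chosen here for brevity. The name `global_random_search` and all parameter defaults are hypothetical and do not come from the paper.

```python
import random


def global_random_search(phi, m=0.0, sigma=2.0, lam=20, generations=50, seed=0):
    """Minimise phi on the real line with a Gaussian search distribution.

    The update rule in Step 3 (refit mean and standard deviation to the
    better half of the samples) is an illustrative assumption.
    """
    rng = random.Random(seed)
    # Step 1: the initial search distribution is N(m, sigma^2).
    best_x, best_f = m, phi(m)
    for _ in range(generations):  # Step 4: stop after a fixed budget
        # Step 2: sample lam points and rank them by the quality function.
        points = sorted((rng.gauss(m, sigma) for _ in range(lam)), key=phi)
        if phi(points[0]) < best_f:
            best_x, best_f = points[0], phi(points[0])
        # Step 3: construct the next distribution by changing (m, sigma).
        elite = points[: lam // 2]
        m = sum(elite) / len(elite)
        sigma = (sum((x - m) ** 2 for x in elite) / len(elite)) ** 0.5
        sigma = max(sigma, 1e-12)  # keep the distribution non-degenerate
    return best_x, best_f


# Usage: minimise a shifted parabola whose optimum lies at x = 1.
best_x, best_f = global_random_search(lambda x: (x - 1.0) ** 2)
```

The only state carried between iterations is the pair `(m, sigma)`, which is exactly the sense in which such an algorithm "alters its search distribution" by changing parameters.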

The core ingredient of this search scheme is the search distribution $P_{\mathcal{P}}^{(t)}$, also called exploration distribution, on the search space. Random search algorithms can differ fundamentally in the way they represent and alter the search distribution. Typically, the distribution is represented by a semi-parametric model. The choice of the model determines the exploration distributions the search algorithm can represent, which is in general only a small subset of all possible probability distributions on $\mathcal{P}$. For example, let $\mathcal{P} = \mathbb{R}$. Then the class of representable distributions may be given by the class of normal densities,

$$p_{\mathcal{P}}(x; m, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-m)^2}{2\sigma^2}\right),$$

where $m$ is the expectation and $\sigma^2$ the variance. This equation defines a parameterization of the search distribution: it maps the parameters $(m, \sigma) \in \mathbb{R}^2$ into the set of probability distributions on $\mathcal{P}$ (see Fig. 2). A global random search algorithm alters its exploration distribution (see Step 3) by changing such parameters.

^1 This scheme does not account for all kinds of evolutionary computation. For example, if the evolutionary programming scheme is used, where each parent generates exactly one offspring [0], or recombination is employed, then the EA is better described as operating on a search distribution $P_{\mathcal{P}^\lambda}^{(t)}$ on $\mathcal{P}^\lambda$. However, in these cases $P_{\mathcal{P}}^{(t)}$ can still be derived from $P_{\mathcal{P}^\lambda}^{(t)}$.

We believe that two of the most fundamental questions 
in evolution theory can be stated as: 

1. How is the exploration distribution parameterized? 

2. How is the exploration distribution altered? 

In the framework of evolutionary systems, we identify each element of $\mathcal{P}$ with certain phenotypic traits of individuals and call $\mathcal{P}$ the phenotype space. Each phenotype $p$, i.e., element of $\mathcal{P}$, is encoded by a genotype $g$, which is an element of a genotype space $\mathcal{G}$. The mapping $\phi : \mathcal{G} \to \mathcal{P}$ is called genotype-phenotype mapping. If $\phi$ is not injective we speak of a neutral encoding. A number of genotypes $g_1^{(t)}, \dots, g_\mu^{(t)}$, the parents, are stored in a population. The superscript indicates the iteration of the algorithm, i.e., the generation. In each generation, $\lambda$ offspring $\tilde{g}_1^{(t)}, \dots, \tilde{g}_\lambda^{(t)}$ are generated by applying stochastic and/or deterministic variation operators. Let the probability that parents $g_1, \dots, g_\mu \in \mathcal{G}$ generate an offspring $g \in \mathcal{G}$ be described by the probability distribution $P_{\mathcal{G}}(g \mid g_1, \dots, g_\mu; \theta)$. This distribution is additionally parameterized by some external strategy parameters $\theta \in \Theta$. Examples of such externally controlled parameters include the probabilities that certain variation operators are applied and parameters that determine the mutation strength. We call the probability

$$P_{\mathcal{G}}^{(t)}(g) := P_{\mathcal{G}}\left(g \mid g_1^{(t)}, \dots, g_\mu^{(t)}; \theta^{(t)}\right) \qquad (1)$$

that in generation $t$ an offspring $g$ is created the exploration distribution on $\mathcal{G}$ at generation $t$.



The genotype-phenotype mapping $\phi$ lifts $P_{\mathcal{G}}^{(t)}$ from the genotype space onto the phenotype space:

$$\forall p \in \mathcal{P}: \quad P_{\mathcal{P}}^{(t)}(p) = \sum_{g' \in \phi^{-1}(p)} P_{\mathcal{G}}^{(t)}(g'), \qquad (2)$$

where $\phi^{-1}(p) := \{g' \in \mathcal{G} \mid \phi(g') = p\}$ is called the neutral set of $p \in \mathcal{P}$ p5f . Thus, the genotype space $\mathcal{G}$ together with the genotype-phenotype mapping $\phi$, the variation operators, and external parameters can be regarded as a parameterization of the exploration distribution on the search space $\mathcal{P}$. We refer to |nj for more details on this way of formalizing evolutionary exploration.
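To make the lifting of the exploration distribution concrete, the following sketch sums genotype probabilities over neutral sets for a small discrete example. The encoding (3-bit genotypes whose phenotype is the number of ones) and the helper names `lift_to_phenotypes` and `neutral_set` are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from itertools import product


def lift_to_phenotypes(p_genotype, phi):
    """Sum genotype probabilities over each neutral set phi^{-1}(p)."""
    p_phenotype = defaultdict(float)
    for g, prob in p_genotype.items():
        p_phenotype[phi(g)] += prob
    return dict(p_phenotype)


def neutral_set(genotypes, phi, p):
    """Return all genotypes that phi maps onto the phenotype p."""
    return {g for g in genotypes if phi(g) == p}


# Hypothetical toy encoding: 3-bit genotypes, phenotype = number of ones.
genotypes = ["".join(bits) for bits in product("01", repeat=3)]
phi = lambda g: g.count("1")

# A uniform exploration distribution on the genotype space ...
p_genotype = {g: 1.0 / len(genotypes) for g in genotypes}
# ... is non-uniform on the phenotype space, because neutral sets differ in size.
p_phenotype = lift_to_phenotypes(p_genotype, phi)
```

In this toy case the phenotypes 1 and 2 each receive probability 3/8, while 0 and 3 receive 1/8, which shows how the mapping alone shapes the distribution on the search space.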

Also algorithms recently developed in the field of evolutionary computing can be captured by this point of view: Mühlenbein et al. [16] and Pelikan et al. |^9| parameterize the exploration density by means of a Bayesian dependency network in order to introduce correlations in the exploration distribution. The CMA evolution strategy proposed by Hansen and Ostermeier Q adapts a covariance matrix that describes the dependencies between real-valued variables in the exploration density. In the remainder of this paper, we focus on a different approach that might be considered biologically more plausible, namely to utilize an appropriate genotype-phenotype mapping to parameterize the exploration density $P_{\mathcal{P}}^{(t)}$. It does not explicitly encode correlations but models such interactions indirectly via the genotype-phenotype mapping. In the next section, we ask how this approach allows for self-adaptation of the exploration strategy.

III. Self-adaptation 

A. Introduction 

The ability of an evolutionary algorithm to adapt its 
search strategy during the optimization process is a key 
concept in evolutionary computation, see the overviews 
[[l], p|. Online adaptation of strategy parameters is 
important, because the best setting of an EA is usually 
not known a priori for a given task and a constant search 
strategy is usually not optimal during the evolutionary 
process. One way to adapt the search strategy online 
is self-adaptation, see || for an overview. This method 
can be described as follows Q: "The idea of the evo- 
lution of evolution can be used to implement the self- 
adaptation of parameters. Here the parameters to be 
adapted are encoded into the chromosomes and undergo 
mutation and recombination. The better values of these 
encoded parameters lead to better individuals, which in 
turn are more likely to survive and produce offspring and 
hence propagate these better parameter values." In other 
words, each individual not only represents a candidate 
solution for the problem at hand, but also certain strat- 
egy parameters that are subject to the same selection 
process — they hitchhike with the object parameters. 

Self-adaptation is used in all main paradigms of evolutionary computation. The most prominent examples stem from evolution strategies, where it is used to adapt the covariance matrix of the mutation operator, see Sec. IV-B. Self-adaptation is employed in evolutionary programming for the same purpose, but also in the original framework of finite-state machines [10]. In genetic algorithms, the concept of self-adaptation has been used to adapt mutation probabilities || and crossover operators [23]. Self-adaptive crossover operators have also been investigated in genetic programming Q . In the following, we will propose an alternative definition of self-adaptation, which includes these approaches as special cases.



B. Neutrality and self-adaptation

Following the formalism we developed in the previous section, a variation of the exploration density $P_{\mathcal{P}}^{(t)}$, corresponding to Step 3 in the global random search algorithm, occurs if

1. the parent population $(g_1^{(t)}, \dots, g_\mu^{(t)})$ varies, or

2. the external strategy parameters $\theta^{(t)}$ vary, or

3. the genotype-phenotype mapping $\phi$ varies.

From a biological point of view, one might associate environmental conditions (like the temperature, etc.) with external parameters that vary the mutation probabilities or the genotype-phenotype mapping. In most cases one would not consider them as subject to evolution.^2 Sometimes mechanisms that adapt the exploration distribution by genotype variations are regarded as examples of adaptive genotype-phenotype mappings. In our view, this is misleading. For example, t-RNA determines a part of the genotype-phenotype mapping and it is itself encoded in the genome. However, the genotype-phenotype mapping should be understood as a whole, mapping the entire genotype (including the parts that code for the t-RNA) to the phenotype, such that it becomes inconsistent to speak of genotypic information parameterizing the genotype-phenotype mapping.

The same arguments also apply in the context of evolutionary computation and thus we consider only option 1 as a possibility to vary the exploration distribution in a way that is itself subject to evolution in the sense of Section III-A. However, if the genotype-phenotype mapping is injective, every variation of genotypes alters phenotypes and bears the risk of a fitness loss. Hence, we conclude:

In the absence of external control, only neutral genetic variations can allow a self-adaptation of the exploration distribution without changing the phenotype, i.e., without the risk of losing fitness.

A neutral genetic variation means that parent and offspring have the same phenotype. For instance, consider two genotypes $g_1, g_2$ in a neutral set. Neglect crossover and assume that the probability for an offspring $g$ of a single parent $g_i$ is given by $P_{\mathcal{G}}(g \mid g_i; \theta)$. The two genotypes may induce two arbitrarily different exploration distributions "around" the same phenotype $p = \phi(g_1) = \phi(g_2)$, see Fig. 1. Transitions between these genotypes allow for switching the exploration strategy. In general, the variety of exploration densities that can be explored in a neutral set $\phi^{-1}(p)$ is given by

$$\left\{ P_{\mathcal{P}}(p' \mid g_i; \theta) \;\middle|\; g_i \in \phi^{-1}(p) \right\}. \qquad (3)$$
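The set in Eq. (3) can be enumerated explicitly for a small example. In the sketch below, a genotype is a pair `(x, s)`, the phenotype is `x` alone, and the second gene `s` acts as a step size that is phenotypically neutral. This encoding, the mutation rule (move by plus or minus `s` with equal probability, leaving `s` unchanged), and all function names are assumptions made for illustration only.

```python
def phi(genotype):
    """Genotype-phenotype mapping: only the first gene is expressed."""
    return genotype[0]


def offspring_distribution(parent):
    """P_G(g | parent): move x by +/- s with equal probability, keep s."""
    x, s = parent
    return {(x - s, s): 0.5, (x + s, s): 0.5}


def exploration_on_phenotypes(parent):
    """Lift the offspring distribution of one parent to the phenotype space."""
    dist = {}
    for g, prob in offspring_distribution(parent).items():
        dist[phi(g)] = dist.get(phi(g), 0.0) + prob
    return dist


def exploration_strategies(neutral_set):
    """One exploration distribution per genotype of the neutral set."""
    return {g: exploration_on_phenotypes(g) for g in neutral_set}


# Two genotypes with the same phenotype 0 but different step sizes.
strategies = exploration_strategies([(0, 1), (0, 3)])
```

Both genotypes have identical fitness because they share a phenotype, yet `(0, 1)` explores the neighbors -1 and 1 while `(0, 3)` explores -3 and 3. A neutral mutation from one to the other therefore changes the search strategy without changing the phenotype.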



^2 Counter-examples that go far beyond the scope of our formalism are, for instance, the embryonic environment (uterus) and the inherited ovum, or individuals whose behavior has an influence on mutations (e.g., sexual behavior influencing crossover) or on the embryonic development (e.g., a mother taking drugs). All of these influences might be considered as subject to evolution. In particular the interpretation of an ovum is a critical issue. Should one regard it as part of the genotype or as an inherited "parameter" of the genotype-phenotype mapping?

Fig. 1. Two different points $g_1, g_2$ in $\mathcal{G}$ are mapped onto the same point in $\mathcal{P}$. The elliptic ranges around the points illustrate the exploration distributions by suggesting the range of probable mutants. Thus, the two points $g_1, g_2$ belong to one neutral set but represent two different exploration strategies on the search space $\mathcal{P}$.



C. Discussion 

It is widely accepted that changing the genotypes with- 
out significantly changing the phenotypes in the pop- 
ulation is a key search strategy in natural evolution 
[ p"2"l |j~4] , p4| . We emphasize that under the stated assump- 
tions, neutrality is even a necessity for self-adaptation 
of the search strategy. Thus, we propose to define self- 
adaptation as the use of neutrality in order to vary the 
exploration strategy. 

Existing approaches to self-adaptation developed in the realm of evolutionary computation can be embedded in this point of view; only the style in which the notion of self-adaptation was originally introduced is different. Usually one refers to "strategy parameters" that typically control the mutation operators (i.e., the mapping $g_i \mapsto P_{\mathcal{G}}(g \mid g_i; \theta)$), in contrast to "object parameters" describing fitness-relevant traits. In the case of self-adaptation these strategy parameters are considered as parts of the genotype. In this view, strategy parameters are neutral, i.e., altering them does not change the phenotype. In turn, we regard neutral genotypic variables as potential strategy parameters. There still is a slight difference: In case of standard self-adaptation, altering a strategy parameter is phenotypically neutral and has no other implication than a change of the exploration strategy. In contrast, two or more genetic variables may be phenotypically non-neutral but a specific simultaneous mutation of them might be neutral and induce a change of the exploration density. Hence, a neutral set is a general concept for self-adaptation, which is not bound to the idea of single loci being responsible for the search strategy. A good example is that topological properties of the search space (i.e., neighborhood relations or the set of most probable mutations of a phenotype) may also vary along such a neutral set if the genotype-phenotype mapping is chosen similar to grammars [31], which seems hard to realize by explicit strategy parameters.

We believe that allowing for a self-adaptive exploration strategy is the main benefit of neutral encodings, as stated in [^7j : "For redundancy to be of use it must allow for mutations that do not change the current phenotype, thus maintaining fitness, and which allow for moves to areas where new phenotypes are accessible". Changing the exploration distribution corresponds to the ability of reaching new phenotypes.^3

In this view, neutrality is not necessarily redundant: 
Different points in a neutral set can encode different 
exploration strategies and thus different information; a 
genotype encodes not only the information on the phe- 
notype but also information on further explorations. 
One might state it like this: Although a non-injective 
genotype-phenotype mapping can be called redundant, 
the corresponding mapping from genotype to exploration 
distribution may in general be non-redundant. 

IV. Examples 

A. Codon Bias in natural evolution 

An intriguing study of the interrelationship between neutrality and self-adaptation in nature is the one by Stephens and Waelbroeck |}(| . They empirically analyze the codon bias and its effect in RNA sequences of the HI virus. In biology, several nucleotide triplets may encode the same amino acid. For example, there are 6 triplets that are mapped to the amino acid Arginine. If Arginine is encoded by the triplet CGA, then the chance that a single point mutation within the triplet is neutral (does not change the encoded amino acid) is 4/9. In contrast, if it is encoded by AGA, then this neutral degree is 2/9. Now, codon bias means that, although there exist several codons that code for the same amino acid (which form a neutral set), HIV sequences exhibit a preference on which codon is used to code for a specific amino acid. More precisely, at some places of the sequence codons are preferred that are "in the center of this neutral set" (with high neutral degree) and at other places codons are biased to be "on the edge of this neutral set" (with low neutral degree). It is clear that these two cases induce different exploration densities; the former case means low mutability whereas the latter means high mutability. Stephens and Waelbroeck go further by giving an explanation for these two (marginal) exploration strategies: Loci with low mutability cause "more resistance to the potentially destructive effect of mutation", whereas loci with high mutability might induce a "change in a neutralization epitope which has come to be recognized by the [host's] immune system" pPf .
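The neutral degrees quoted for the two Arginine codons can be checked by enumerating all single point mutations under the standard genetic code. The sketch below is an illustration only: the compact 64-letter table string (bases ordered U, C, A, G) is the usual textbook encoding of the standard code, and the helper names `point_mutants` and `neutral_degree` are hypothetical.

```python
from fractions import Fraction
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def point_mutants(codon):
    """All nine codons that differ from `codon` in exactly one position."""
    return [
        codon[:i] + base + codon[i + 1:]
        for i in range(3)
        for base in BASES
        if base != codon[i]
    ]


def neutral_degree(codon):
    """Fraction of single point mutations that keep the encoded amino acid."""
    amino_acid = CODON_TABLE[codon]
    mutants = point_mutants(codon)
    neutral = sum(CODON_TABLE[m] == amino_acid for m in mutants)
    return Fraction(neutral, len(mutants))


# The neutral set of Arginine ('R') and the neutral degrees of two of its codons.
arginine_codons = sorted(c for c, a in CODON_TABLE.items() if a == "R")
degree_cga = neutral_degree("CGA")
degree_aga = neutral_degree("AGA")
```

Here `degree_cga` evaluates to 4/9 and `degree_aga` to 2/9, so the two codons encode the same amino acid but differ in how likely a point mutation is to alter it.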

B. Self-adaptation in evolution strategies

In this section, we describe a rather simple example of self-adaptation in evolution strategies.^4 The candidate solutions for the problem at hand are represented by real-valued vectors $x \in \mathcal{P} \subseteq \mathbb{R}^n$. These real-valued object parameters are mutated by adding a realization of a normally distributed random vector with zero mean [ pT| p6| . The symmetric $n \times n$ covariance matrix of this random vector is subject to self-adaptation. To describe any possible covariance matrix, $n(n+1)/2$ strategy parameters are needed. However, often a reduced number of parameters is used. In the following, let the covariance matrix be given by $(\sigma I)^2$, i.e., $\mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2)$, where $I = \mathrm{diag}(1, \dots, 1)$ is the identity matrix and $\sigma \in \mathbb{R}^n$ corresponds to $n$ strategy parameters that describe the standard deviations along the coordinate axes.

^3 However, we think that there might be at least one additional case where neutrality can be useful, namely when it introduces, by chance, a bias in the search space, such that desired phenotypes are (in global average) represented more often than other ones [13].

^4 More sophisticated and efficient algorithms exist for dealing with real-world optimization tasks, e.g., the derandomized adaptation as proposed in [11], [18].

Let $\mu$ denote the size of the parent population. Each generation, $\lambda$ offspring $\tilde{g}_i^{(t)}$, $i = 1, \dots, \lambda$, are generated. $(\mu, \lambda)$-selection is used, i.e., the $\mu$ best individuals of the offspring form the new parent population. An individual $g_i^{(t)} \in \mathbb{R}^{2n}$ can be divided into two parts, $g_i^{(t)} = (x_i^{(t)}, \sigma_i^{(t)})$, the object variables $x_i^{(t)} \in \mathbb{R}^n$ and the strategy parameters $\sigma_i^{(t)} \in \mathbb{R}^n$.

For each offspring $\tilde{g}_i^{(t)}$, two indices $a, b \in \{1, \dots, \mu\}$ are selected randomly, where $a$ determines the parent and $b$ its mating partner. For each component $k = 1, \dots, n$ of the offspring, the new standard deviation $\tilde{\sigma}_{i,k}^{(t)}$, i.e., the $k$-th entry of $\tilde{\sigma}_i^{(t)} = (\tilde{\sigma}_{i,1}^{(t)}, \dots, \tilde{\sigma}_{i,n}^{(t)})$, is computed as

$$\tilde{\sigma}_{i,k}^{(t)} = \frac{1}{2}\left(\sigma_{a,k}^{(t)} + \sigma_{b,k}^{(t)}\right) \cdot \exp\left(\tau' \zeta_i^{(t)} + \tau \zeta_{i,k}^{(t)}\right). \qquad (4)$$

Here, $\zeta_{i,k}^{(t)} \sim \mathcal{N}(0, 1)$ is a realization of a normally distributed random variable with zero mean and variance one that is sampled anew for each component $k$ of each individual, whereas $\zeta_i^{(t)} \sim \mathcal{N}(0, 1)$ is sampled once per individual and is the same for each component. The log-normal distribution ensures that the standard deviations stay positive. For the mutation strengths, $\tau \propto 1/\sqrt{2\sqrt{n}}$ and $\tau' \propto 1/\sqrt{2n}$ are recommended |^6|, Q. It has been shown that recombination of the strategy parameters is beneficial (e.g., (HI). Equation (4) realizes an intermediate recombination of the strategy parameters.

Thereafter the objective parameters are altered using the new strategy parameters:

$$\tilde{x}_{i,k}^{(t)} = x_{a,k}^{(t)} + \tilde{\sigma}_{i,k}^{(t)} z_{i,k}^{(t)}, \qquad (5)$$

where $z_{i,k}^{(t)} \sim \mathcal{N}(0, 1)$. For simplicity, we do not employ recombination of the objective parameters. An example of this strategy compared to algorithms without self-adaptation is shown in Fig. 3. The adaptive strategy performs considerably better than the other methods, as known from many theoretical results and empirical studies. However, we would like to underline that the self-adaptive EA uses a highly redundant encoding instead
of a one-to-one genotype-phenotype mapping: the genotype space has twice the dimensionality of the phenotype space. The neutrality does not alter the distribution of the fitness values; it just allows the search strategy to adapt.

Fig. 2. An example for parameterizing the exploration distribution in evolution strategies: Consider the search space $\mathcal{P} = \mathbb{R}$, $\mu = 3$ parent individuals, and no recombination. The parents represent the object parameters $-2$, $-0.25$, and $3.5$. The offspring is generated in the following manner: First, one randomly chosen parent is reproduced. Second, the reproduced individual is mutated by adding a realization of a normally distributed random number with variance one and expectation zero. Hence, the resulting search distribution (the joint generating distribution |p2|) is a multimodal mixture of Gaussians.
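The self-adaptive evolution strategy of this section can be sketched with the Python standard library alone. The sketch assumes minimization, takes the proportionality constants of the mutation strengths to be one, and uses illustrative defaults ($n = 10$, $\mu = 15$, $\lambda = 100$, 200 generations, all parents started at the same point) that are not taken from the paper. The function name `self_adaptive_es` is hypothetical.

```python
import math
import random


def sphere(x):
    """Sphere model: Euclidean norm of the object variables."""
    return math.sqrt(sum(v * v for v in x))


def self_adaptive_es(phi, n=10, mu=15, lam=100, generations=200, seed=0):
    """(mu, lam)-ES in which each genotype is a pair (x, sigma) of n-vectors."""
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(2.0 * math.sqrt(n))  # per-component mutation strength
    tau_prime = 1.0 / math.sqrt(2.0 * n)       # per-individual mutation strength
    parents = [([1.0] * n, [1.0] * n) for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            x_a, sigma_a = parents[rng.randrange(mu)]  # parent a
            _, sigma_b = parents[rng.randrange(mu)]    # mating partner b
            zeta_i = rng.gauss(0.0, 1.0)               # shared by all components
            # Intermediate recombination and log-normal mutation of the step sizes.
            sigma = [
                0.5 * (sigma_a[k] + sigma_b[k])
                * math.exp(tau_prime * zeta_i + tau * rng.gauss(0.0, 1.0))
                for k in range(n)
            ]
            # Mutation of the object variables with the new step sizes.
            x = [x_a[k] + sigma[k] * rng.gauss(0.0, 1.0) for k in range(n)]
            offspring.append((x, sigma))
        # Selection sees only the phenotype x; sigma is selectively neutral.
        offspring.sort(key=lambda g: phi(g[0]))
        parents = offspring[:mu]
    return parents[0]


best_x, best_sigma = self_adaptive_es(sphere)
```

Note that the sort key uses only `g[0]`: two offspring with the same object variables but different step sizes are indistinguishable to selection, which is the neutrality the section describes.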

V. Conclusion 

Neutrality is a necessity for self-adaptation. Actually, 
the design of neutral encodings to improve the efficiency 
of evolutionary algorithms is a well-established approach: 
strategy parameters are an example of neutrality. Hence, 
there already exists clear evidence for the benefit of neu- 
trality. 

The notion of neutrality provides a unifying formalism 
for embedding approaches to self-adaptation in evolution- 
ary computation. But it also inspires new approaches 
that are not bound to the idea of only single genes being 
responsible for the exploration strategy. Generally, any local properties of the phenotypic search space (metric or topological) may vary along a neutral set. One example is the class of grammar-based genotype-phenotype mappings, for which different points in a neutral set represent completely different topologies (and thus exploration strategies) on the phenotype space.

On the other hand, the benefits of neutrality in natural 
evolution can be better understood when the simple — 
but concrete and well-established — paradigms of self- 
adaptation and neutrality in evolutionary computation 
are taken into account. As we exemplified with the sec- 
ond example, these computational approaches may serve 
as a model to investigate the relation between evolution- 
ary progress, neutrality, and self-adaptation. 

Finally, let us recall that neutrality is not necessarily equivalent to redundancy: in general, a genotype encodes not only the information on the phenotype but also information on further explorations.

Fig. 3. We compare evolution strategies as described in Sec. IV-B with and without self-adaptation on a simple test problem: a 100-dimensional sphere model, $\Phi(x) = \|x\|$. First, the EA is run without self-adaptation; each $\sigma_i$ is set to one. After that, we employ self-adaptation as described, where the $\sigma_i$ are also initialized to one. Based on these results, we determine the average standard deviation $\bar{\sigma}$ (averaged over all parents, all generations, and all trials). Then we repeat the experiments without self-adaptation but fix each $\sigma_i = \bar{\sigma}$. The fitness trajectories of the best individual averaged over 50 trials are shown in the upper plot ($\bar{\sigma} = 0.0125$). In case of the self-adaptive EA the fitness curves show steps. The lower plot shows a typical single trial and the corresponding average of the standard deviations, $\|\langle \sigma \rangle\| = \left( \sum_{k=1}^{n} \left( \frac{1}{\mu} \sum_{i=1}^{\mu} \sigma_{i,k} \right)^2 \right)^{1/2}$. It becomes obvious that each step in the fitness trajectory corresponds to a change in the search strategy. In general, the mutation strength decreases, but sometimes a larger step size takes over the population and leads to a high fitness gain. In the continuum, the probability that an offspring has exactly the same phenotype, i.e., represents the same object parameters, is not measurable. Further, even small changes in the genotype are relevant for selection due to the use of a rank-based selection scheme. Hence, drifting along a neutral network, i.e., traversing the search space by a sequence of (neutral) mutations that do not alter the phenotype, does not occur.